Learning Partially Observable Models Using Temporally Abstract Decision Trees

Author

  • Erik Talvitie
Abstract

This paper introduces timeline trees, which are partial models of partially observable environments. Timeline trees are given some specific predictions to make and learn a decision tree over history. The main idea of timeline trees is to use temporally abstract features to identify and split on features of key events, spread arbitrarily far apart in the past (whereas previous decision-tree-based methods have been limited to a finite suffix of history). Experiments demonstrate that timeline trees can learn to make high quality predictions in complex, partially observable environments with high-dimensional observations (e.g. an arcade game).
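To make the core idea concrete, here is a minimal illustrative sketch (not Talvitie's implementation; all names and the toy domain are invented for illustration) of a decision node that splits on a temporally abstract feature: rather than looking at the observation exactly k steps back, it locates the most recent occurrence of a key event, arbitrarily far in the past, and branches on a feature of that event.

```python
def most_recent(history, event_pred):
    """Return the latest observation satisfying event_pred, or None."""
    for obs in reversed(history):
        if event_pred(obs):
            return obs
    return None

class TimelineNode:
    """One internal node of a toy timeline tree.

    Instead of splitting on a fixed history suffix position, the node
    finds the most recent key event and splits on a feature of it.
    """
    def __init__(self, event_pred, feature, children, default):
        self.event_pred = event_pred  # identifies the key event in history
        self.feature = feature        # feature of that event to split on
        self.children = children      # feature value -> subtree or leaf prediction
        self.default = default        # prediction when the event never occurred

    def predict(self, history):
        obs = most_recent(history, self.event_pred)
        if obs is None:
            return self.default
        child = self.children.get(self.feature(obs), self.default)
        if isinstance(child, TimelineNode):
            return child.predict(history)
        return child  # leaf: a prediction

# Toy usage: predict whether a door is open based on the most recent
# "switch" event, no matter how many steps ago it happened.
tree = TimelineNode(
    event_pred=lambda obs: obs[0] == "switch",
    feature=lambda obs: obs[1],  # switch state: "on" or "off"
    children={"on": "door_open", "off": "door_closed"},
    default="door_closed",
)

history = [("switch", "on"), ("move", None), ("move", None), ("move", None)]
print(tree.predict(history))  # -> door_open
```

The point of the sketch is the split criterion: a fixed-suffix decision tree over the last k observations could not recover the switch event once more than k steps have passed, whereas the temporally abstract split reaches back to it regardless of how much irrelevant history has accumulated since.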


Similar articles

Learning Policies in Partially Observable MDPs with Abstract Actions Using Value Iteration

While the use of abstraction and its benefit in terms of transferring learned information to new tasks has been studied extensively and successfully in MDPs, it has not been studied in the context of Partially Observable MDPs. This paper addresses the problem of transferring skills from previous experiences in POMDP models using high-level actions (options). It shows that the optimal value func...


Using decision trees to select the grammatical relation of a noun phrase

We present a machine-learning approach to modeling the distribution of noun phrases (NPs) within clauses with respect to a finegrained taxonomy of grammatical relations. We demonstrate that a cluster of superficial linguistic features can function as a proxy for more abstract discourse features that are not observable using state-of-the-art natural language processing. The models co...


Using Options for Knowledge Transfer in Reinforcement Learning

One of the original motivations for the use of temporally extended actions, or options, in reinforcement learning was to enable the transfer of learned value functions or policies to new problems. Many experimenters have used options to speed learning on single problems, but options have not been studied in depth as a tool for transfer. In this paper we introduce a formal model of a learning pr...


Deterministic-Probabilistic Models For Partially Observable Reinforcement Learning Problems

In this paper we consider learning the environment model in reinforcement learning tasks where the environment cannot be fully observed. The most popular frameworks for environment modeling are POMDPs and PSRs but they are considered difficult to learn. We propose to bypass this hard problem by assuming that (a) the sufficient statistic of any history can be represented as one of finitely many ...


Partially Observable Markov Decision Processes with Behavioral Norms

This extended abstract discusses various approaches to the constraining of Partially Observable Markov Decision Processes (POMDPs) using social norms and logical assertions in a dynamic logic framework. Whereas the exploitation of synergies among formal logic on the one hand and stochastic approaches and machine learning on the other is gaining significantly increasing interest since several ye...



Journal:

Volume   Issue 

Pages  -

Publication date: 2012